The Discovery of Mathematical Probability Theory: a Case Study in the Logic of Mathematical Inquiry

Authors

  • Daniel Gerardo Campos
  • Douglas Anderson
  • Emily Grosholz
  • Dale Jacquette
  • Catherine Kemp
  • Michael Rovine
  • John Christman
  • Charles Sanders
Abstract

[...]ing from their actual experience with games of chance, a collection of possible outcomes of aleatory trials. This does not mean that the collection or complete enumeration of possible outcomes was directly perceived; it rather means that these inquirers were able to imagine the aleatory situation of dice-throwing, for example, in such a careful, attentive way as to seize upon the complete enumeration of possible outcomes as the ‘mathematical form’ worthy of mathematical study, while disregarding those aspects of the ‘matter’ of the situation that were not of mathematical interest. Among such mathematically unessential aspects of the ‘matter’ of dice-throwing to be disregarded we might include, say, the material of which the dice are made, the color of the dice, who the players are (so long as the game does not involve skill or skill can be assumed equal), whether the throw is right- or left-handed, and so on. I am suggesting, then, that the early mathematical probabilists discovered the fundamental probability set not by abstraction from a direct percept but rather by abstraction from an image. That is, they first imagined an aleatory situation under ideal conditions: perfect dice, equally skilled players, and so on. Then they abstracted from this image the conception of an enumerated collection of possible outcomes. This means that they transformed the possible individual imaginary outcomes of the idealized game into an abstract object, an enumeration or collection which, later in the history of mathematics, came to be conceived as a set. It is not important right now that the initial enumerations may have been incorrect for the purposes of estimating chances insofar as they did not distinguish between partitions and permutations. This distinction would be the later result of a more precise determination of the fundamental probability set in the context of the ‘analysis’ of problems in the calculus of chances, as we can see in Galileo’s ‘analytic’ work.
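The difference between counting partitions and counting permutations can be made concrete with a small enumeration. The sketch below assumes, as an illustration not spelled out in the text above, the three-dice problem usually associated with Galileo: the sums 9 and 10 can each be composed in six ways when the order of the dice is ignored, yet they are not equally likely once every ordered throw is counted as a distinct outcome. The function name and the choice of sums are hypothetical.

```python
from itertools import product


def count_outcomes(target, dice=3, faces=6):
    """Return (partitions, permutations) of throws of `dice` dice summing to `target`."""
    # The fundamental probability set: every ordered throw is a distinct outcome.
    throws = [t for t in product(range(1, faces + 1), repeat=dice) if sum(t) == target]
    # Partitions ignore which die shows which face.
    partitions = {tuple(sorted(t)) for t in throws}
    return len(partitions), len(throws)


for target in (9, 10):
    parts, perms = count_outcomes(target)
    print(f"sum {target}: {parts} partitions, {perms} of {6 ** 3} ordered throws")
```

Counting partitions alone would suggest that 9 and 10 are equally likely; counting ordered throws gives 25 against 27 out of 216.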
The crucial point now is that the fundamental probability set was initially conceived, even if imprecisely, by way of abstraction from an idealized situation. Admittedly, this original image of a game under ideal conditions was the result of abstracting from actual experience with games. I would like to call this operation of abstracting an idea from actual perceptual experience an ‘idealizing abstraction’. Clearly the Peircean stance admits that ideas may arise by abstraction from perceptual experience. But this does not mean that all ideas arise by idealizing abstraction from perceptual experience. Our first ideas, once conceived on the basis of experience, spur a continuous train of thought in the mind that, animated by the imagination, has a life of its own. Ideas engender ideas through the work of the imagination. Sometimes this imaginative work proceeds by way of abstraction. In the conception of a fundamental probability set, the imaginative work (literally, the making of a mathematical image or ‘sign’) consisted in abstracting the mathematical form from the image of idealized aleatory conditions. Once the concept of a fundamental probability set was abstracted, the course of mathematical thought upon this idea took on a life of its own in the minds of a community of inquirers.

5.1.2 Framing Analogy

Analogy is an important method for framing mathematical hypotheses. In a very general sense, analogy is the inference that if two things agree with each other in one or more respects, they will agree in other respects as well. The nature of the agreement between the analogous things (whether individual objects, an object and its sign, relations between objects, systems of relations, and so on) may be one of strict correspondence or of some degree of resemblance.
If the agreement is a strict one-to-one correspondence between (i) all the elements and (ii) the patterns of relations among the elements that constitute two different things, the analogy is an isomorphism. If the number of elements is not the same, but the structural patterns of relation between the elements of the two things are the same, the analogy is a homomorphism. Less strictly, an analogy between two things may be a relation of stronger or weaker resemblance. Peirce considered analogy to be a fourth type of inference, a mixed case between induction and abduction. [Footnote 120: However, later in the development of Peirce’s doctrines of logic, he seems to have come to classify analogy as a species of abduction. The question of what is the form of an analogy according to Peirce and how it ought to be classified is a subject that still requires a thorough investigation of its own. Here I will only try to discuss it in sufficient detail for my present purposes, though I openly admit that my interpretation of analogy may be considered as an incipient attempt at an elaboration of the Peircean model rather than as a faithful account of Peirce’s own explicit views.] In a relatively early effort at discussing the various forms of ampliative inference, Peirce writes that “among probable inferences of mixed character, there are many forms of great importance. The most interesting, perhaps, is the argument from Analogy in which, from a few instances of objects agreeing in a few well-defined respects, inference is made that another object, known to agree with the others in all but one of those respects, agrees in that respect also” (CP 2.787). In this case, we might understand the analogy to be between a sample of completely examined objects and another partially examined object. I think that the Peircean view that analogy is a mixed case between induction and hypothesis or abduction may be elaborated as follows. In an analogy, there is an enumeration of the aspects in which two things correspond to, or at least resemble, each other. This sort of enumerative evidence is usually inductive: it consists in a sample of observations about particulars. However, the inference is not from particular instances to a probable general rule; it is rather a conclusion to another particular aspect in which the analogous things plausibly agree. No definite probability is attached to the conclusion of the inference by analogy.
The inference only suggests what may plausibly be the case about one of the things under examination on the basis of its partial agreement with the other, and this is similar to a ‘habitual’ abductive conclusion. In analogy, therefore, no general rule is inferred, as it is in ‘creative’ abduction. Analogy remains at the level of knowledge of particulars, and so it is not an inference that leads to generalization. It does however tend to extend our knowledge (in the Peircean sense of ‘extension’ versus ‘generalization’) by concluding that an object may plausibly belong to a known class of objects. With specific regard to necessary diagrammatic reasoning, Peirce writes that “deduction consists in constructing an icon or diagram the relations of whose parts shall present a complete analogy with those of the parts of the object of reasoning, of experimenting upon this image in the imagination, and of observing the result so as to discover unnoticed and hidden relations among the parts” (CP 3.363; emphasis mine). Therefore, here Peirce claims that the mathematician constructs an icon or diagram for study by way of analogy, where we might take a “complete analogy” to be an isomorphism, or at least a homomorphism, between the thing represented and the diagrammatic representation of the thing.
In the larger context of his thought, I think we should take Peirce’s position to be that mathematical ‘diagrams’ are often constructed by analogy, though mathematicians have other methods of construction depending on what the specific reasoning context requires. Be that as it may, it is precisely as an extensive inference (that is, as an inference that adds “breadth” rather than “depth” to our knowledge) that Bernoulli uses an analogy to justify, in one of a series of arguments, the application of mathematical probability to the study of natural phenomena. As I discussed in section 4.1.1, in his 1703-1704 correspondence with Leibniz, Bernoulli used an analogy between an urn filled with pebbles and a human body containing sicknesses or diseased parts to warrant the application of the probability calculus to the study of natural events, such as diseases and storms. Bernoulli offers the analogy in response to his correspondent’s objection that the possible outcomes of natural events, unlike those of games of chance, are infinite; as Hacking points out, this objection amounts in contemporary terms to claiming that there is no fundamental probability set for natural events (Hacking 1975, p. 163-164). Let me now expound Bernoulli’s inference by analogy. The initial step is to enumerate several characteristics with respect to which the urn and the human body “agree” or are analogous. Bernoulli lists two such characteristics in this case. Accordingly, the first premise of the analogy is that just as the urn contains white or black balls in a ratio that is unknown to us, so also the human body contains “the tinder of sicknesses within itself” (Bernoulli 1966, p. 76). For simplicity, we might interpret Bernoulli to mean that the human body contains healthy parts (analogous to white balls) and sick parts (analogous to black balls) in a ratio that is unknown to us.
Bernoulli assumes that the ratio of sick to healthy parts provides an indication of the probability of death within a given time period. It is likely that Bernoulli conceived of this tinder box of sicknesses as containing various types of sicknesses, such as dropsy and plague, which he uses in other examples and which were commonly listed in mortality tables such as those gathered in John Graunt’s 1662 Bills of Mortality. [Footnote 121: For a reprint of some summary statistics on causes of mortality from Graunt’s Bills, see David 1962, p. 101.] This latter interpretation of course would complicate the analogy, since we would have to conceive of the various sicknesses as analogous to balls of various colors within an urn. These complications notwithstanding, the second premise is that just as it is possible to sample with replacement from the urn so as to determine empirically the ratio of white to black balls, so also is it possible to sample parts from the human body and observe them so as to determine empirically the ratio of healthy to sick parts. The next step is to point out some characteristic that is known to be true of the urn. Accordingly, the third premise is the claim, which Bernoulli believes to be warranted by his theorem, that on the basis of a mathematically determined sample size we can become “morally certain” that the experimentally observed ratio of white to black balls is as close to the true ratio as is scientifically desired (Bernoulli 1966, p. 75). The final step is to draw the conclusion by analogy. Bernoulli thus infers that on the basis of a mathematically determined sample size we can also become “morally certain” that the empirical ratio of healthy to sick parts is as close to the true ratio as is scientifically desired. Regardless of the factual and conceptual merits of the various premises, the most important point for our present purposes is that by way of this analogy Bernoulli extends the notion of the fundamental probability set.
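The third premise can be illustrated numerically. The sketch below is not Bernoulli’s own calculation: it simply sums exact binomial probabilities to find the chance that the observed proportion of white balls, after n draws with replacement, lies within a chosen margin of the true proportion. The urn composition (3 white to 2 black), the margin (1/20), the sample sizes, and the function name are all assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def chance_within_margin(n, p, margin):
    """Probability that the observed proportion of white balls in n draws
    with replacement lies within `margin` of the true proportion p."""
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - p) <= margin:
            # Binomial probability of exactly k white balls in n draws.
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


p = Fraction(3, 5)        # hypothetical urn: 3 white balls for every 2 black
margin = Fraction(1, 20)  # hypothetical tolerance around the true proportion

for n in (20, 100, 500):
    print(n, float(chance_within_margin(n, p, margin)))
```

On these assumptions the chance grows toward 1 as the sample grows, which is the qualitative content of the appeal to “moral certainty”. The sketch says nothing about whether a human body can be sampled as an urn can.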
Just as there is a fundamental probability set in games of chance, so also is there a fundamental probability set in natural events. Bernoulli argues that mathematically it is irrelevant that in natural events the number of possible outcomes constituting the set is infinite since the ratio of two infinite quantities may nevertheless be finite, as the notion of a limit shows. [Footnote 122: For an eloquent discussion of the scientific and conceptual context that made the various premises and implicit assumptions of Bernoulli’s analogy plausible or implausible to his contemporaries, see Daston 1988, p. 230-253. Her main thesis is that Bernoulli created a model of causation that appealed to those of his contemporaries who wanted to uphold the possibility of practically certain scientific knowledge against skeptical arguments. The model of causation consisted in conceiving of a priori probabilities as the unknown causes of observed effects, namely, a posteriori statistical ratios. Thus it was possible to reason scientifically from effects (observed frequencies) back to causes (unobserved a priori probabilities) and to obtain scientifically practical certainty about the hidden causes of observed phenomena.] Bernoulli’s analogy, therefore, extends the general concept of the fundamental probability set so as to be applicable in the particular case of natural events. Daston explains that “Bernoulli’s appropriation of the urn example to describe the processes linking inaccessible causes [i.e. a priori probabilities] to observed effects [i.e. statistical frequencies] expanded not only the domain of problems upon which probabilists might test their skills, but also the conceptual tools for extending the range of the theory’s applications still further. By likening situations as disparate as the diseases that afflict the human body...to drawing white and black balls at random from an urn, probabilists hoped to free their theory from its preoccupation with gambling puzzles.
Bernoulli’s urn model of causation set the pattern for other applications of classical probability theory” (Daston 1988, p. 238). This is an exemplary instance, therefore, of the significant re-creation by an analogical extension of a ‘framing hypothesis’. [Footnote 123: Recall that I have already argued in section 4.3 that the imaginative creation or re-creation of ‘framing hypotheses’ ought to be considered an inextricable aspect of mathematical reasoning.] Further studies into historical cases ought to lead to a more comprehensive list of methods for the making of framing hypotheses. For now, however, let us turn to examine cases of ‘analytical’ hypothesis making.

5.2 The ‘Analytic Method’ of Mathematics and the Heuristics of ‘Analytical Hypothesis-Making’

I call ‘analytical hypotheses’ those that the mathematicians conceive as plausible solutions to a problem. In Peircean terms, these hypotheses consist in experimental modifications to an existing ‘diagram’ (be it a geometrical figure, an algebraic equation, and so on) in order to derive another ‘diagram’ representing the sought mathematical result. Again, I will examine some salient cases from the early history of mathematical probability so as to (i) clarify in general what I, following Cellucci, mean by the ‘analytic method’ of mathematics, and (ii) define and illustrate some of the most important heuristic methods for making ‘analytical hypotheses’ outlined by Peirce and Cellucci.

5.2.1 Lessons from Huygens’s General Method of Solution for the Problem of Points

After his 1655 visit to Paris, Huygens set to work on various problems on the mathematics of chance, including the problem of points, independently of Pascal and Fermat.
His work is highly significant since, as David proclaims, “[t]he scientist who first put forward in a systematic way the new propositions evoked by the problems set to Pascal and Fermat, who gave the rules, and who first made definitive the idea of mathematical expectation was Christianus Huygens” (David 1962, p. 110). In his 1657 De Ratiociniis in Aleae Ludo, Huygens put forth the first systematic treatment of the mathematics of chance, and this work became the standard text for studying the elements of the doctrine of chances. It was subject to various English translations, one of them by John Arbuthnot, and Jacob Bernoulli included it, with his own annotations, as part I of the Ars Conjectandi. The main body of Huygens’s De Ratiociniis in Aleae Ludo consists of the following fourteen propositions:

I: To have equal chances of getting a and b is worth (a + b) / 2.
II: To have equal chances of getting a, b or c is worth (a + b + c) / 3.
III: To have p chances of obtaining a and q of obtaining b, chances being equal, is worth (pa + qb) / (p + q).
IV: Suppose I play an opponent as to who will win the first three games and that I have already won two and he one. I want to know what proportion of the stakes is due to me if we decide not to play the remaining games.
V: Suppose that I lack one point and my opponent three. What proportion of the stakes, etc.
VI: Suppose that I lack two points and my opponent three, etc.
VII: Suppose that I lack two points and my opponent four, etc.
VIII: Suppose now that three people play together and that the first and second lack one point each and the third two points.
IX: In order to calculate the proportion of stakes due to each of a given number of players who are each given numbers of points short, it is necessary, to begin with, to consider what is owing to each in turn in the case where each might have won the succeeding game.
X: To find how many times one may wager to throw a six with one die.
XI: To find how many times one should wager to throw 2 sixes with 2 dice.
XII: To find the number of dice with which one may wager to throw 2 sixes at the first throw.
XIII: On the hypothesis that I play a throw of 2 dice against an opponent with the rule that if the sum is 7 points I will have won but that if the sum is 10 he will have won, and that we split the stakes in equal parts if there is any other sum, find the expectation of each of us.
XIV: If another player and I throw turn and turn about with 2 dice on condition that I will have won when I have thrown 7 points and he will have won when he has thrown 6, if I let him throw first find the ratio of my chance to his. [Footnote 124: I have listed the propositions as translated in David 1962, p. 116-117.]

Among these propositions, the most important for our discussion will be number IX because it provides a general rule for the solution of any particular version of the problem of points. I will detail the heuristic method that Huygens employs to derive this rule. Prior to that specific discussion, however, let us turn to consider briefly what Huygens’s treatise reveals in general about the ‘analytic method’ of mathematical inquiry.

5.2.1.1 The ‘Analytic Method’ of Mathematics

Carlo Cellucci has recently argued in favor of a philosophical view of mathematics as an open-ended, heuristic practice instead of what he calls the ‘foundationalist’ view of mathematics as a closed-ended body of knowledge completely determined by self-evident axioms (see Cellucci 2000 and 2002). In that context, he argues that the ‘heuristic’ view reveals that the actual method of mathematical inquiry is ‘analytic’ instead of ‘axiomatic’. Actual mathematical inquiry does not proceed by way of mechanical deduction from self-evident principles and axioms. Some mathematical theories might exhibit an axiomatic structure once they are developed, but at this point they are “dead,” so to speak; established, axiomatized theories are no longer an actual, living matter of inquiry. [Footnote 125: If I may venture a metaphor, I think of these theories as a kind of fossil record of what once was a living matter.] Mathematical inquiry rather proceeds by way of analytical problem-solving. According to Cellucci, “the analytic method is the procedure according to which one analyzes a problem [that is, breaks it into constituent problems, or reduces it to another problem, and so on] in order to solve it and, on the basis of such analysis, one formulates a hypothesis. The hypothesis constitutes a sufficient condition for the solution of the problem, but it is itself a problem that must be resolved. In order to resolve it, one proceeds in the same way, that is, one analyzes it and, on the basis of such analysis, formulates a new hypothesis. [Thus, analysis] is a potentially infinite process” (Cellucci 2002, p. 174). [Footnote 126: All translations from this work are mine.] Under this view, therefore, the search for an absolute foundation to mathematical knowledge is vain. To cast mathematical axioms as self-evident truths that serve as absolute foundations for mathematical knowledge is to curtail the actual process of analytical inquiry. Moreover, inasmuch as the analytic “passage from the given problem to a hypothesis that constitutes a sufficient condition for its solution is configured as a reduction from one problem to another, the analytic method is also called the method of reduction” (p. 175). And inasmuch as the analytic method requires formulation of a hypothesis for the solution of a problem, it “is also called the method of hypothesis” (p. 177). Analysis, then, consists in reasoning processes that we might very broadly conceive as ‘reduction’ and ‘hypothesis-making’. Now, I have argued in section 2.4 that, in my estimation, the Peircean conception of mathematics is open-ended, heuristic, and advocates the analytic method in the
Celluccian sense since it views mathematics as an inquiring practice in which the mathematician solves ideal problems by way of analytical hypothesis-making. These ideal problems are framed within hypothetical states of affairs that often serve as representative models for ‘actual’ problems in nature. Moreover, ‘framing hypotheses’ in the Peircean sense need not be axioms, and even when an ideal mathematical system is axiomatized, the axioms are simply hypotheses that may be reconceived as the process of mathematical inquiry demands. I want to submit now that the ‘heuristic’ view of mathematics, with its endorsement of the analytical method, explains well the type of mathematical practice that Huygens’s treatise reveals. There are no axioms serving as the foundation of Huygens’s De Ratiociniis in Aleae Ludo. There is rather a series of propositions, all framed by the general idea of the fundamental probability set, that actually stand for problems of chance and expectation. In order to solve them, Huygens analyzes them, reducing them to other problems and posing hypothetical solutions; that is, in Peircean terms, experimenting with the mathematical models or ‘diagrams’ that represent the problem and observing the results of experimentation. The solution to each problem, in turn, suggests new problems for investigation. The analytical process, then, gradually leads to a “deepening” of knowledge on the mathematics of chance. For example, even without discussing the details here, we might easily imagine that the solution to the problem stated in proposition VII could proceed by analyzing this problem into those problems already solved in the immediately preceding propositions. And as we shall see in detail shortly, proposition IX is a general problem that can be analyzed into simpler problems that are either of easy solution or already solved in previous propositions, especially II and VIII.
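As an illustration of how such a reduction might run for proposition VII (a reconstruction, not Huygens’s own text), write e(m, n) for the share of the stakes q due to me when I lack m points and my opponent lacks n, assuming equal chances in each game. The notation e(m, n) is an assumption introduced here. By proposition I, each share is the average of the two shares that would result from the next game:

```latex
\begin{align*}
e(m,n) &= \tfrac{1}{2}\,e(m-1,n) + \tfrac{1}{2}\,e(m,n-1), \qquad e(0,n)=q,\quad e(m,0)=0,\\
e(1,2) &= \tfrac{1}{2}\bigl(q + \tfrac{1}{2}q\bigr) = \tfrac{3}{4}q && \text{(proposition IV)}\\
e(1,3) &= \tfrac{1}{2}\bigl(q + \tfrac{3}{4}q\bigr) = \tfrac{7}{8}q && \text{(proposition V)}\\
e(2,3) &= \tfrac{1}{2}\bigl(\tfrac{7}{8}q + \tfrac{1}{2}q\bigr) = \tfrac{11}{16}q && \text{(proposition VI)}\\
e(1,4) &= \tfrac{1}{2}\bigl(q + \tfrac{7}{8}q\bigr) = \tfrac{15}{16}q\\
e(2,4) &= \tfrac{1}{2}\bigl(\tfrac{15}{16}q + \tfrac{11}{16}q\bigr) = \tfrac{13}{16}q && \text{(proposition VII)}
\end{align*}
```

The values e(1, 1) = e(2, 2) = q/2 used above follow by symmetry. The case e(1, 4) is not among the listed propositions and is computed here by the same rule.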
Moreover, in his general treatment of the problem of points in proposition IX Huygens assumes that all players have equal chances of winning each game. This suggests a new, more general, problem: what if the players do not have equal chances of winning each game? Abraham de Moivre takes up this problem and offers an even more general solution to the problem of points in his 1718 Doctrine of Chances. We find in Huygens’s treatise, then, not an axiomatized theory but a series of interrelated problems regarding the calculus of chance whose solutions eventually lead Huygens to offer general rules for the solution of similar problems, such as the general method for solving particular problems of points stated in proposition IX. And the same analytical process is taken up by other inquirers, so that the analytical method does tend towards increasingly more general problems, potentially ad infinitum. Now, proponents of the ‘foundationalist’ view of mathematics as an affair of deduction from self-evident axioms might of course deny that Huygens’s treatise is properly a mathematical work. Daston in fact observes that even though “the famous correspondence between Blaise Pascal and Pierre Fermat first cast the calculus of probabilities in mathematical form in 1654, many mathematicians would argue that the theory achieved full status as a branch of mathematics only in 1933 with the publication of A. N. Kolmogorov’s Grundbegriffe der Wahrscheinlichkeitsrechnung. Taking David Hilbert’s Foundations of Geometry as his model, Kolmogorov advanced an axiomatic formulation of probability based on Lebesgue integrals and measure set theory. 
Like Hilbert, Kolmogorov insisted that any axiomatic system admitted ‘an unlimited number of concrete interpretations besides those from which it was derived,’ and that once the axioms for probability theory had been established, ‘all further exposition must be based exclusively on these axioms, independent of the usual concrete meaning of these elements and their relations’” (Daston 1988, p. 3). Under such a ‘foundationalist’ view, therefore, the work of all the early probabilists, including Huygens, may be regarded as a non-mathematical, even if scientific, attempt at providing quantified models of chance phenomena, but not as mathematical theorizing proper. Foundationalists may concede Daston’s own view that “the link between model and subject matter is considerably more intimate than that between theory and applications” so that, even in the eyes of the early probabilists, the field of mathematical probability was “a mathematical model of a certain set of phenomena, rather than...an abstract theory independent of its applications” (Daston 1988, p. xii). Even conceding this, however, the “foundationalists” would not confer upon early mathematical probability the seemingly privileged rank of a theory. From a Peircean perspective, the distinction between model and theory may be of philosophical interest for understanding the structure of mathematical and scientific knowledge, but it is not relevant for determining whether the early probabilists were acting and reasoning as mathematicians. Foundationalist philosophers of mathematics may impose their conceptions of mathematics on early mathematical probability in order to battle as much as they want about whether it is a model or a theory. However, from a Peircean open-ended, heuristic perspective, what marks the probabilists’ reasoning as genuinely mathematical is that they were creating ideal states of affairs and studying what would be true about them.
Whether the results of studying these hypothetical states of affairs amounted structurally to models or to theories is beside our point of interest. Nevertheless, let me state that I think that no deeper understanding of mathematics is gained by arbitrarily circumscribing the notion of mathematical ‘theory’ to axiomatized systems of propositions. If anything, it promotes the erroneous idea that mathematics is the dead stuff printed in a certain kind of textbook. [Footnote 127: Preferably one without any figures, actual diagrams, pictures, conjectures or wild guesses. See, for instance, James Robert Brown’s discussion of the Bourbaki group in French mathematics, which equates the highest standards of rigor with a thorough refusal to use any pictures or figures or other heuristic aids in their demonstrations (Brown 1999, p. 172-173). The immediate Peircean reply, of course, is that the members of the Bourbaki group only take themselves not to be working with ‘diagrams’ even though their algebraic and analytic expressions are also mathematical ‘signs’ usually equivalent to other “pictorial” forms of representation.] My inclination is to say that a mathematical ‘theory’ is a purely ideal system while a mathematical ‘model’ is a system that represents an ‘actual’ problematic phenomenon. In Peircean terms, a theory is a ‘pure icon’ while a model is a ‘symbolic icon’. Qua pure mathematicians, the early probabilists were creating a theory; qua applied mathematical scientists, they were modeling aleatory phenomena. Be that as it may, what is crucial to us is that the ideal systems of early mathematical probability were open-ended and subject to reconception and revision, as problem-solving demanded and as mathematical theorizing and the modeling of actual chance phenomena dictated. Whether theorizing or modeling, their activity was thoroughly mathematical, and it proceeded by problem-solving and hypothesis-making. Huygens’s work testifies to this, as we shall now see.
[Footnote 128: To be bold, I would say that mathematics is not what philosophers, or mathematicians with philosophical dogmas, define it to be according to their precepts, proceeding then to exclude from mathematics anything that does not fit those precepts: mathematics is what mathematicians actually do. And what mathematicians actually do is recorded in its history as well as enacted in living research. This is why I think philosophers of mathematics ought to look at the history of the subject in order to understand its nature. Otherwise, the discussion over this or that arbitrary prescription of what mathematics is becomes extremely uninteresting.]

5.2.1.2 Generalization and Particularization as Analytical Heuristics

Proposition IX provides a general rule for the solution of the problem of points. Let me first expound Huygens’s demonstration and then discuss what it reveals about analytical heuristics. Again, the proposition is the following: In order to calculate the proportion of stakes due to each of a given number of players who are each given numbers of points short, it is necessary, to begin with, to consider what is owing to each in turn in the case where each might have won the succeeding game. To demonstrate it, Huygens reasons as follows. [Footnote 129: My rendition of Huygens’s reasoning is a loose translation of his demonstration as reprinted in Bernoulli 1713, p. 18-19.] (I will insert my annotations in parentheses.) He supposes that there are three players, A, B, and C, and that A lacks one game, B two games, and C two games in order to win the match. (That is, he begins by considering a particular problem of points.) He begins by trying to find the proportion of stakes due to B, calling the sum of stakes q (which serves as an algebraic unknown), if either A, or B himself, or C wins the first succeeding game. There are, therefore, three cases to consider.

(a) If player A were to win the next game, then the match would end and consequently the sum due to B is 0 (i.e. B is due 0q).
(b) If B were to win the next game, he would therefore lack 1 game, while A and C would still lack 1 and 2 games respectively. Therefore, by proposition VIII, B is due 4q/9. (Imagine the “Fermatian table” of equipossible outcomes for the ensuing situation. There would be 4 out of 9 possible outcomes that would favor player B. Huygens does not construct a Fermatian table, but the exercise allows us to understand the result given our previous discussion of the Fermatian method in the Pascal-Fermat correspondence.)
(c) Lastly, if C were to win the next game, then he would lack 1 game, while A and B would still lack 1 and 2 games respectively. Consequently, by proposition VIII, B is due 1q/9. (Again imagine the “Fermatian table” of equipossible outcomes for the ensuing situation. There would be only one out of nine possible outcomes that would favor player B.)

Moreover, if we “colligate in one summation,” that is, if we add, that which in each of the three cases is due to B, namely 0, 4q/9, and 1q/9, the result is 5q/9. Dividing this sum by 3, which is the number of players, the result is exactly 5q/27. By proposition II this is the “part sought,” that is, the proportion of the total stakes that is due to B. (Had we diagrammed the “Fermatian table” for the particular version of the problem of points that Huygens considers, we would have found that there are twenty-seven equipossible outcomes, out of which only five outcomes favor player B.) As if to make his reasoning completely clear, Huygens restates his conclusion that since B would obtain either 0, 4q/9, or 1q/9, then by proposition II the proportion of stakes due to B is “0 + 4q/9 + 1q/9 : 3” or 5q/27. (At this point, Huygens derives a general rule for solving the problem of points from the foregoing solution to one particular version of the problem of points.)
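The arithmetic of this demonstration can be retraced mechanically. The following sketch is a modern reconstruction rather than Huygens’s procedure as he wrote it: it assumes that every player is equally likely to win each game, represents a play situation as a tuple of the games each player still lacks, and applies the rule of proposition IX recursively. The function name `shares` and the tuple encoding are assumptions made for illustration.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def shares(lacks):
    """Fraction of the stakes due to each player, given the games each still lacks."""
    n = len(lacks)
    # A player who lacks no games has won the match and takes all the stakes.
    if 0 in lacks:
        return tuple(Fraction(int(x == 0)) for x in lacks)
    totals = [Fraction(0)] * n
    for winner in range(n):
        # Situation that would ensue if `winner` took the next game.
        after = tuple(x - 1 if i == winner else x for i, x in enumerate(lacks))
        for i, s in enumerate(shares(after)):
            totals[i] += s
    # Each player is equally likely to win the next game, so average the cases.
    return tuple(t / n for t in totals)


a, b, c = shares((1, 2, 2))
print(a, b, c)  # 17/27 5/27 5/27
```

Calling `shares((1, 1, 2))` reproduces the values 4q/9, 4q/9, and q/9 of proposition VIII on which cases (b) and (c) rely.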
Therefore, Huygens argues, one must consider in any problem whatsoever, clearly in the preceding one or in any other version of the problem, what is due to each player in the case where each might win the next game. (In the previous particular problem, we would find by the same method that A is due 17q/27 and C is due 5q/27.) For just as we cannot solve the preceding problem until we “subduce” it under the calculations already done for proposition VIII, so also we cannot solve the problem in which the three players lack 1, 2, and 3 games respectively until we calculate how the stakes ought to be distributed when: (i) they lack 1, 2, and 2 games respectively, which is the preceding problem just solved, and (ii) they lack 1, 1, and 3 games respectively, which is the problem already solved in proposition VIII. (Note that when (iii) they lack 0, 2, 3 games respectively, the solution is trivial since A gets all of the stakes. This is why Huygens does not list it.)

Huygens provides a table that “comprehends” the calculations for each subsequent particular problem of points, up to the problem in which A, B, and C lack 2, 3, and 5 games respectively, noting that the particular solutions can be extended. (By providing the table, Huygens emphasizes that his general rule will work no matter how complex the particular problem of points under study.)

Allow me to draw out now what Huygens’s demonstration reveals about mathematical demonstration via analytical heuristics. I find in Huygens’s demonstration the five-stage process of necessary reasoning that Peirce outlines.

(i) Huygens expresses a hypothesis in general terms, in this case, his proposed general rule for solving the problem of points. This proposition becomes a problem that ought to be resolved.
(ii) Huygens creates a concrete ‘diagram’ or mathematical icon to represent the general proposition; in this case, the ‘diagram’ consists in a particular, definite, play situation that can be represented by words, Fermatian tables, and so on.

(iii) Huygens experiments upon the ‘diagram’ by imagining possible changes to it, in this case, by considering three possible modifications that correspond to each of the three players winning the next game. As I already noted, each of the three ensuing situations can then be represented by three new ‘diagrams’, say, by three “Fermatian tables” of the resulting play situations.

(iv) Huygens observes the results of the experimental modification of the original ‘diagram’ and grasps that the experiment solves the problem, i.e. ‘de-monstrates’ the original general proposition.

(v) Finally, Huygens generalizes the solution, even providing a complex table that we might consider a composite diagram of multiple play situations and that we understand to be but a part of an infinitely composite diagram.

Let us now delve into the ‘analytical’ stages of the process. In this example, stage (ii) consists in a ‘particularization’ of the problem of points. Cellucci defines heuristic particularization as “the inference by way of which one passes from one hypothesis to another one that it contains as a particular case” (Cellucci 2002, p. 267). We might state the general problem of points as follows: Given that players A, B,..., X, Y, Z lack a, b,..., x, y, z points respectively to win the match, find the proportion of the total stakes q that is due to each one of them. Huygens finds that trying to find a general rule of solution directly from this general statement of the problem is too difficult. Thus he particularizes the general problem, in effect by constructing a particular ‘diagram’ of it, and proceeds to solve the particular version.
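The rule of proposition IX, as expounded above, lends itself to a short recursive sketch. The following Python fragment is only an illustration in modern terms and not a rendering of Huygens’s own procedure: the function name `shares`, the use of exact fractions, and the memoization are assumptions introduced here, and the code presupposes, as the problem does, that every player is equally likely to win each game.

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def shares(lacking):
    """Fraction of the stakes due to each player.

    `lacking` is a tuple giving the number of games each player still
    needs; the name and signature are hypothetical, chosen for this sketch.
    """
    players = len(lacking)
    # Trivial case: a player who lacks 0 games has won and takes everything.
    for i, need in enumerate(lacking):
        if need == 0:
            return tuple(Fraction(int(j == i)) for j in range(players))
    # Otherwise consider, in turn, each player winning the next game,
    # add up what is due in each case, and divide by the number of players.
    totals = [Fraction(0)] * players
    for winner in range(players):
        after = tuple(n - 1 if j == winner else n for j, n in enumerate(lacking))
        for j, part in enumerate(shares(after)):
            totals[j] += part
    return tuple(total / players for total in totals)


print(shares((1, 1, 2)))  # the situation of proposition VIII
print(shares((1, 2, 2)))  # the particular problem solved in proposition IX
```

Each call reduces a play situation to the situations that follow the next game, so the same function can be tried on any number of players and any numbers of games lacking.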
Experimental stage (iii) consists in ‘reducing’ the particular ‘diagram’ of the problem into three alternative ‘diagrams’ of problems that have already been solved. ‘Reduction’ in this sense simply means to resolve the present problem into one or more alternative problems whose solutions, when composed or linked in some suitable way, are sufficient for solving the original one. In this case, Huygens reduces the problem in which players A, B, and C lack 1, 2, and 2 games respectively into three alternative problems: how to divide the stakes when (a) they lack 0, 2, and 2 games; (b) they lack 1, 1, and 2 games; and (c) they lack 1, 2, and 1 games. Case (a) has a trivial solution, and cases (b) and (c) have already been solved in proposition VIII. Additionally, proposition II provides the rule by which the original problem can be solved in terms of the solutions to cases (a), (b), and (c).

In stage (iv) Huygens “sees” or “grasps” that the method of solution is general: it can be applied to any particular problem and it will lead to the correct solution. Equivalent modifications of the original diagram in any play situation will yield the correct response regarding the fair distribution of stakes. Cellucci defines heuristic generalization as “the inference by way of which one passes from one hypothesis to another one that contains it as a particular case” (Cellucci 2002, p. 267). I submit that Huygens “grasps” the generality of the rule quickly due to his vigorous power of generalization. Any mathematician with a lesser power of generalization, however, could arrive at the same generalization by conducting other diagrammatical experiments. It is in this sense that Peirce argues that necessary mathematical reasoning is inductive. The mathematician could experiment with problems in which there are, say, four players that lack 1, 1, 1, and 2 points.
She could create a diagram of this play situation and resolve it by the same method into the various possible alternative diagrams. Still she would find that Huygens’s general rule works. Thus, mathematicians arrive at inductive generalizations more or less quickly according to whether their power of generalization is more or less vigorous. Accordingly, Huygens emphasizes the generality of his method by providing a table with the solution to more complex games. No matter how complex the diagram, his general method works, and his readers can confirm it inductively by conducting alternative experiments themselves. In this way, stage (v) is completed and the process of demonstration ends.

In sum, I propose that in the course of this demonstration Huygens reasons analytically, deploying the heuristic experimental techniques of ‘particularization’, ‘reduction’, and ‘generalization’ to solve a problem and, consequently, to “demonstrate necessarily” what originally stood as a hypothetical proposition. Accordingly, I find in this example a confirmation that the method of mathematical inquiry is analytical and experimental (in the Cellucian and Peircean senses that I have expounded) and an illustration that ‘particularization’, ‘reduction’, and ‘generalization’ are among the key heuristic techniques of research that mathematicians ought to cultivate. Let us turn to learn other heuristic methods of analysis from Bernoulli’s proof of his famous law of large numbers.

5.2.2 Lessons from the Demonstration of Bernoulli’s Theorem

In his Ars Conjectandi, Bernoulli states and demonstrates the first law of large numbers, arguably the most important breakthrough of early mathematical probability theory. Bernoulli sought to find a way to estimate the probabilities of events on the basis of empirical observation in situations where the probabilities cannot be estimated a priori.
In the next chapter, I will discuss in detail the logical implications of Bernoulli’s mathematical findings and of his arguments to warrant the application of mathematical probability theory, on the basis of his theorem, to problems in the natural and social sciences. Right now I will discuss the strictly mathematical problem that Bernoulli poses and resolves. It is, in fact, a two-fold problem.

First, Bernoulli wants to show that as the number of empirical observations increases and tends towards infinity, the observed statistical frequency of a phenomenon approaches its a priori probability without asymptotical bound. He claims that by an “instinct of nature” we know that the more observations we have the lesser the risk involved in estimating the a priori probability from the a posteriori ratio. However, what we know by instinct requires mathematical demonstration (Bernoulli 1966, p. 38). That is, Bernoulli hypothetically poses, on the basis of an instinct of nature, a mathematical proposition, and this proposition becomes an ‘analytical’ problem that he must resolve. Stephen Stigler observes that “the empirical approach to the determination of chances was not new with Bernoulli, nor did he consider it to be new. What was new was Bernoulli’s attempt to give formal treatment to the vague notion that the greater the accumulation of evidence about the unknown proportion of cases, the closer we are to certain knowledge about that proportion” (Stigler 1986, p. 65). Bernoulli approaches the problem formally by trying to show that as the number of empirical observations of a binomial event tends towards infinity, the probability that the difference between the a priori probability and the observed frequency is less than any small number approaches 1 (or maximum probability).
In contemporary terms, Bernoulli seeks to prove the following hypothesized mathematical proposition: Let $p$ be the probability of a successful event $E$ being the outcome on any chance experimental trial. Let $n$ be the number of experimental trials, $x$ be the number of successes in $n$ trials, and $s_n = x/n$ be the proportion of successes in $n$ trials. Bernoulli wants to justify mathematically that $s_n$ is a good estimate of $p$ by showing that for any small positive number $\varepsilon$, the probability $P(|p - s_n| < \varepsilon) \to 1$ as $n \to \infty$.

Second, Bernoulli seeks to show that the required number of experimental trials may be specified mathematically in order to ensure that the empirical estimate is as close to the a priori probability of an event as is desired. That is, Bernoulli wants to show that, for any given small probability $\delta$, $n$ can be specified such that $P(|p - x/n| \le \varepsilon) > 1 - \delta$. Continuing with our foregoing notation, we may express Bernoulli’s statement of the problem as follows: Show that $n$ may be specified such that, for any given large positive number $c$, $P(|p - x/n| \le \varepsilon) > c \, P(|p - x/n| > \varepsilon)$.

Stigler emphasizes that Bernoulli proved more than just the first law of large numbers. Had Bernoulli only considered the first part of the problem, it would be strictly fair to call his theorem just the first “weak law of large numbers.” However, mainly on account of the second part, Bernoulli’s “actual result was deeper, subtler, more precise, more difficult, and more ambitious than the simple and elementary statement of the weak law of large numbers” (Stigler 1986, p. 66). Bernoulli not only demonstrated formally that as the number of observations increases to infinity the probability that the difference between the observed frequency and the a priori probability is arbitrarily small tends to 1, but he also showed how to determine the number of observations required in order to attain a desired level of accuracy in the statistical estimate. With this in mind, let us turn to discuss Bernoulli’s attack on this two-fold analytical problem.

130 See Hacking 1971b, p. 221-222, and Stigler 1986, p. 63-70. I am following Hacking’s (1971b) notation now in order to engage his discussion of the problem more easily in the next chapter.

131 I adapt this expression from Stigler 1986, p. 66. Stigler points out several possible mathematical and conceptual pitfalls related to stating Bernoulli’s result in contemporary probabilistic terms. Among these, the most salient is that Bernoulli only treats the case where the numbers of successes, r, and failures, s, are integers, and does not treat the contemporary situation in which the ratio p = r / (r + s) ranges over all the real numbers in the interval [0, 1]. For the details, see p. 66-67.

Bernoulli’s demonstration of the theorem is extensive and considerably longer than contemporary proofs of the weak law of large numbers. Stigler reconstructs Bernoulli’s mathematical proof in contemporary terms with clarity and succinctness (see Stigler 1986, p. 67-70). Accordingly, my aim here is not to retrace the path already covered by Stigler nor to reconstruct the entirety of Bernoulli’s proof. I will rather expound Bernoulli’s heuristic attack on the problem, emphasizing and explaining the ‘analytical’ character of his reasoning.

Bernoulli states the theorem, or problematic proposition in need of proof, as follows:

I will call those cases in which a certain event can happen successful or fertile cases; and those cases sterile in which the same event cannot happen. Also, I will call those trials successful or fertile in which any of the fertile cases is perceived; and those trials unsuccessful or sterile in which any of the sterile cases is observed.
Therefore, let the number of fertile cases to the number of sterile cases be exactly or approximately in the ratio r to s, and hence the ratio of fertile cases to all the cases will be r / (r + s) or r / t [letting t = r + s], which is within the limits (r + 1) / t and (r - 1) / t. It must be shown that so many trials can be run such that it will be more probable than any given times (e.g., c times) that the number of fertile observations will fall within these limits rather than outside these limits; i.e., it will be c times more likely than not that the number of fertile observations to the number of all the observations will be in a ratio neither greater than (r + 1) / t nor less than (r - 1) / t. (Bernoulli 1966, p. 60-61)

Bernoulli sets out to show, then, that it is possible to determine a number of trials so that the empirical statistical frequency of an event will be within two bounds around the true probability: the more the observations, the tighter the bounds.

Bernoulli’s strategy is to ‘reduce’ the probabilistic problem to a problem stated in terms of the binomial expansion. The ‘reduction’ works as follows. Recalling that r is the number of “fertile” or successful trials and s is the number of “sterile” or unsuccessful trials, let $nt = n(r + s)$ be the total “number of observations taken.” This means that there are n binomial experiments and that in each of those there are t trials of which r are successes and s are failures. Now, expand the binomial $(r + s)^{nt}$ and relate each of the terms of the expansion divided by $t^{nt}$ with the “expectation” [expectatio] or “probability” [probabilitas] of a given ratio of successful and unsuccessful trials. That is, expand the binomial to get the expression:

$r^{nt} + \frac{nt}{1} r^{nt-1} s + \frac{nt(nt-1)}{1 \cdot 2} r^{nt-2} s^{2} + \frac{nt(nt-1)(nt-2)}{1 \cdot 2 \cdot 3} r^{nt-3} s^{3} + \dots + \frac{nt}{1} r s^{nt-1} + s^{nt}$.
Then, on the basis of proposition XIII of the first part of the Ars Conjectandi, Bernoulli can deduce that each of the terms of the expansion divided by $t^{nt}$ gives the probability of a given ratio of successes and failures. That is, $r^{nt} / t^{nt}$ gives the probability that all the trials are successes; $\left[\frac{nt}{1} r^{nt-1} s\right] / t^{nt}$ gives the probability that all but one of the trials are successes, and so on. Consequently, eliminating the common denominator $t^{nt}$, it is possible to identify the first term of the expansion, $r^{nt}$, with the number of experiments where there are only successful trials; the second term of the expansion, $\frac{nt}{1} r^{nt-1} s$, with the number of experiments in which all the trials but one are successes; the third term, $\frac{nt(nt-1)}{1 \cdot 2} r^{nt-2} s^{2}$, with the number of experiments in which all but two trials are successes; and so on, until the last term, $s^{nt}$, is identified with the number of experiments in which all the trials are failures.

132 Here we see the influence that the existing knowledge in mathematics has on mathematical research. Newton’s proof of the binomial theorem enabled a more powerful strategy of proof by Bernoulli.

Having expressed all the possible outcomes in terms of the binomial expansion, Bernoulli can now solve the probabilistic problem exclusively in terms of the expansion and of some properties of infinite series. However, prior to detailing the ensuing analytical solution, I want to characterize the nature of Bernoulli’s ‘reduction’ of the probabilistic problem so far. Bernoulli has created an isomorphism between the collection of all the possible outcomes of the experiments, i.e. of what we call the ‘fundamental probability set’, and the binomial expansion. All the possible outcomes of the nt experiments are counted in the binomial expansion, and each of the terms of the expansion represents the number of ways in which each possible combination of r successes and s failures can occur.
Moreover, the ratio of each of the terms to the common denominator $t^{nt}$ represents the probability of each of the possible events. A simple example will clarify the matter. Suppose there is a two-sided die, with one white and one black face. Let the experiment be to throw the die twice. Perform the experiment twice. Now determine the probability of each of the possible events. In this case, let r = number of “white” throws and s = number of “black” throws. The number of throws, or trials, per experiment is t = 2, and the number of experiments, or double-throws of the die, is n = 2. All the elements of the fundamental probability set and the probabilities of all the possible events are represented in the following binomial expansion:

$(r + s)^{nt} = (r + s)^{2 \cdot 2} = (r + s)^{4} = r^{4} + 4 r^{3} s + \frac{4 \cdot 3}{1 \cdot 2} r^{2} s^{2} + 4 r s^{3} + s^{4}$.

There are five terms in the expansion corresponding to five possible events, namely: (1) the coefficient of $r^{4}$ indicates that there is 1 possible way to throw four whites; (2) the coefficient of $r^{3} s$ indicates that there are 4 possible ways to throw three whites and one black; (3) the coefficient of $r^{2} s^{2}$ indicates that there are 6 possible ways to throw two whites and two blacks; (4) the coefficient of $r s^{3}$ indicates that there are 4 possible ways to throw one white and three blacks; (5) and the coefficient of $s^{4}$ indicates that there is 1 possible way to throw four blacks. Since the total number of possible outcomes is $t^{nt} = 2^{4} = 16$, the probability of each event can be obtained by dividing the number of possible outcomes favoring each of the events above by 16.

133 It is important to emphasize that Bernoulli carried out considerable original work on infinite series and that a treatise on the subject was included as an appendix to the Ars Conjectandi.

Now, historically Bernoulli is not the first to use the binomial expansion to treat combinatorial problems in probability. However, he seems to have conceived of this method on his own.
Daston, for instance, writes that “[a]lthough much of Chapter 3 of Part II of the Ars conjectandi duplicated the results of Pascal’s Traité du triangle arithmétique (1665), Bernoulli appears not to have known of Pascal’s work. However, Bernoulli produced a ‘table of combinations’ that is essentially the arithmetic triangle, and proceeded to investigate its ‘truly curious and surprising’ properties, including the familiar derivation of the coefficients of the binomial expansion” (Daston 1988, p. 235). Moreover, Bernoulli’s analysis of the problem under discussion in terms of the binomial expansion reveals a vigorous and original imagination.

In sum, the isomorphism between the elements and relations involved in the probabilistic problem and the elements and relations represented in the binomial expansion allows Bernoulli to treat the problem in terms of relevant properties of the expansion itself. In Peircean terms, thus far (i) Bernoulli has expressed a hypothesis in general terms by proposing the theorem, and (ii) he has created a concrete ‘diagram’ or mathematical icon to represent the situation described in the hypothesis. The binomial expansion is a ‘diagram’ of the fundamental probability set designed with the express purpose of exhibiting clearly the relations among elements of that set that are relevant to the problem at hand. The next stage in the reasoning is (iii) to experiment upon the ‘diagram’ by introducing suitable changes to the algebraic expression and exhibiting relations between the terms that compose it so as to derive the desired result.

From a Peircean perspective, I submit that Bernoulli proceeds to reasoning stage (iii) by way of the following analysis. Rewrite the binomial expansion of $(r + s)^{nt}$ as:

$r^{nt} + \frac{nt}{1} r^{nt-1} s + \dots + L_n + \dots + M + \dots + R_n + \dots + \frac{nt}{1} r s^{nt-1} + s^{nt}$,

where $M$ is the largest term in the expansion, and $L_n$ and $R_n$ are the terms that are n places to the left and to the right of $M$ respectively.
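The correspondence between the expansion and the fundamental probability set can be checked by brute force in very small cases. The Python sketch below is a modern illustration under stated assumptions, not Bernoulli’s procedure: the function names are hypothetical, the terms are indexed by the number of sterile trials so that the list runs from the all-fertile term to the all-sterile term, and the enumeration is only feasible when nt is tiny.

```python
from itertools import product
from math import comb


def expansion_terms(r, s, n):
    """Terms of (r + s)**(n*t), ordered from the all-fertile to the all-sterile term."""
    t = r + s
    trials = n * t
    # The k-th term (k = number of sterile trials) is C(nt, k) * r**(nt-k) * s**k.
    return [comb(trials, k) * r ** (trials - k) * s ** k for k in range(trials + 1)]


def counts_by_enumeration(r, s, n):
    """Count every equipossible outcome directly, grouped by number of sterile trials."""
    t = r + s
    trials = n * t
    counts = [0] * (trials + 1)
    # Each trial realizes one of t cases: cases 0..r-1 are fertile, the rest sterile.
    for outcome in product(range(t), repeat=trials):
        sterile = sum(1 for case in outcome if case >= r)
        counts[sterile] += 1
    return counts


n = 2
terms = expansion_terms(1, 1, n)   # the two-sided die thrown four times
m = terms.index(max(terms))        # position of the largest term M
print(terms, terms == counts_by_enumeration(1, 1, n))
print("M =", terms[m], "L_n =", terms[m - n], "R_n =", terms[m + n])
```

Agreement between the two lists in such small cases is of course no proof of the general correspondence; it only makes the ‘diagram’ tangible.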
As I have already pointed out, this ‘diagram’ is an orderly schema of all the possible cases that make up the fundamental probability set. But this recasting of the binomial expansion already poses an ancillary problem. Bernoulli needs to show that the largest term $M$ represents the number of possible experimental outcomes that result in nr successes and ns failures. He proves this in lemma 3, with the support of lemmas 1 and 2. Thus, $M$ represents the total number of possible cases in which the true ratio r:s is empirically observed.

134 Bernoulli formally presents the proof by stating and proving five lemmas and then invoking them in the demonstration of the main theorem. But I submit that his actual inquiry did not follow this order. He rather first analyzed the main problem into ancillary problems that he proceeded to resolve in the lemmas.

Bernoulli’s next step in his ‘diagrammatical experiment’ consists in resolving or literally breaking this schema into two component parts; for convenience of exposition, I will call them the ‘central’ and ‘peripheral’ parts. The ‘central’ part consists of the terms $L_n, \dots, M, \dots, R_n$. The ‘peripheral’ part consists of the terms $r^{nt}, \frac{nt}{1} r^{nt-1} s, \dots, L_{n-1}$, and $R_{n+1}, \dots, \frac{nt}{1} r s^{nt-1}, s^{nt}$. Now, add all of the terms of the ‘central’ part, $L_n + \dots + M + \dots + R_n$, and all of the terms of the ‘peripheral’ part, $r^{nt} + \frac{nt}{1} r^{nt-1} s + \dots + L_{n-1} + R_{n+1} + \dots + \frac{nt}{1} r s^{nt-1} + s^{nt}$.

Now observe the result of the diagrammatical experiment so far (recall that observation is stage (iv) in the Peircean model of mathematical reasoning). Bernoulli already proved that $M$ represents the number of possible cases that result in nr successes and ns failures. Now, observe that since $L_n$ and $R_n$ are the terms that are n places to the left and to the right of $M$ respectively, they represent the number of cases in which either nr + n or nr - n trials are “fertile” (while the rest of the trials are “sterile”) respectively.
Therefore, observe that the sum $L_n + \dots + M + \dots + R_n$ represents the number of possible cases in which no more than nr + n and no less than nr - n trials are successful, while the sum $r^{nt} + \dots + L_{n-1} + R_{n+1} + \dots + s^{nt}$ represents the number of possible cases in which either more than nr + n or less than nr - n trials are successful. Recalling the isomorphism already described, observe also that the former sum of ‘central’ terms represents the number of possible cases in which the observed experimental outcomes approximate the true ratio r : s, while the latter sum of ‘peripheral’ terms represents the number of possible cases in which the observed experimental outcomes deviate from the true ratio r : s.

On the basis of all the foregoing experimentation and observation, we see that what Bernoulli needs to show is that n can be chosen so large that $(L_n + \dots + M + \dots + R_n) > c \, (r^{nt} + \dots + L_{n-1} + R_{n+1} + \dots + s^{nt})$ for any desired c. This analytical recasting of the problem again poses ancillary problems that need to be resolved. First Bernoulli needs to show that in the expansion of a binomial of power nt, n can be chosen so that the ratio of $M$ to $L_n$ and the ratio of $M$ to $R_n$ will be greater than any given ratio. This he proves in lemma 4. Then Bernoulli needs to show that in the same binomial expansion, n can be chosen so that the ratio of the sum of all the terms from $M$ up to and including $L_n$ (or $R_n$) to the sum of all the terms beyond $L_n$ (or $R_n$), that is, the ratio of ‘central’ to ‘peripheral’ terms to the left or right of $M$, is greater than any given ratio. This he proves in lemma 5. Both of these proofs reflect Bernoulli’s powerful ingenuity and deep and subtle knowledge of infinite series.

By lemmas 4 and 5, then, Bernoulli can deduce that a power nt of the binomial r + s can be chosen so large that $(L_n + \dots + M + \dots + R_n) > c \, (r^{nt} + \dots + L_{n-1} + R_{n+1} + \dots + s^{nt})$ for any c. Observe that this solves the original problem 1, whose result is the law of large numbers.
Divide both sides of the inequality by the total number of possible outcomes, $t^{nt}$. The resulting inequality of probabilities, $(L_n + \dots + M + \dots + R_n) / t^{nt} > c \, (r^{nt} + \dots + L_{n-1} + R_{n+1} + \dots + s^{nt}) / t^{nt}$ for any c, expresses the desired result, namely, that as the number of experiments increases, and eventually as $n \to \infty$, it will be more than c times more probable that the observed ratio of successes to failures will be within the bounds $L_n / t^{nt}$ and $R_n / t^{nt}$ around the true probability than outside those bounds. In other words, the number of observations can always grow so large that it will be more probable that the empirically observed frequencies will approximate rather than deviate from the a priori probability, where what “more probable,” “approximate,” and “deviate” mean can be specified with mathematical precision.

However, Bernoulli does not stop here. There is a second part to the original problem, namely, to find the n necessary for a desired level of approximation of the statistical estimate to the a priori probability. He lets R be equal to the ratio of the number of possible successful trials to the number of all possible trials. Since the power nt of the binomial represents the total number of experiments, it follows from the preceding result that so many observations can be made such that the sum of cases in which $(nr - n)/nt < R < (nr + n)/nt$, or equivalently, $(r - 1)/t < R < (r + 1)/t$, exceeds the sum of the other cases by more than c times. That is, n not only can grow sufficiently large but its value can also be determined with mathematical exactitude so that the statistical estimate will “approximate” the a priori probability to a precisely specified degree; in the foregoing notation, so that R will be c times more likely to fall within than outside the bounds $(r - 1)/t$ and $(r + 1)/t$.
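The inequality between the ‘central’ and ‘peripheral’ sums can be explored numerically. The sketch below is an assumption-laden illustration rather than Bernoulli’s method: it finds a sufficient n by direct search over exact integer sums instead of through the bounds of lemmas 4 and 5, the function names and the search limit `n_max` are hypothetical, and it requires r and s to be positive integers. The central part is taken to comprise the terms with between nr - n and nr + n fertile trials, inclusive, as in the text.

```python
from math import comb


def central_and_peripheral(r, s, n):
    """Sum the central and peripheral terms of (r + s)**(n*t), with t = r + s."""
    t = r + s
    trials = n * t

    def term(k):
        # Number of equipossible outcomes with exactly k fertile trials.
        return comb(trials, k) * r ** k * s ** (trials - k)

    # Central part: between nr - n and nr + n fertile trials, inclusive.
    central = sum(term(k) for k in range(n * r - n, n * r + n + 1))
    # Everything else in the expansion is peripheral.
    peripheral = t ** trials - central
    return central, peripheral


def first_sufficient_n(r, s, c, n_max=200):
    """First n <= n_max, found by direct search, with central > c * peripheral."""
    for n in range(1, n_max + 1):
        central, peripheral = central_and_peripheral(r, s, n)
        if central > c * peripheral:
            return n
    return None  # no such n within the search limit


print(central_and_peripheral(3, 2, 1))
print(first_sufficient_n(3, 2, c=10))
```

The search returns only the first n at which the inequality holds; it does not reproduce the n that Bernoulli’s own, more conservative, bounds would prescribe.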
At this point, Bernoulli’s demonstrative reasoning ends: the stages of analytical experimentation and observation have finally led to a full demonstration of the general theorem. He proceeds to apply his newly demonstrated result to some specific examples.

At this point, I should emphasize a crucial insight that I owe to Daston. She points out, almost in passing, that “Bernoulli’s method of approximating by inequalities the required number of observations nt...was inspired by Archimedes’s approximation of π” (Daston 1988, p. 236). In defending his claim that the estimation of a priori probabilities by way of a posteriori frequencies can be shown sufficient for “moral certainty” even in situations where the number of possible cases involved is infinite, Bernoulli cites historical accomplishments in which mathematical approximations are sufficient for practical use. As his main example, he mentions that “the determinate ratio of the circumference of a circle to its diameter...cannot be expressed accurately except by the infinitely continued decimal places of Ludolphus, but...nevertheless, [it] is bounded by Archimedes, Metius, and Ludolphus himself within limits which are very sufficient to practical application” (Bernoulli 1966, p. 43). The proof strategy that Archimedes uses to bound the value of π provided Bernoulli with a plan of attack to demonstrate his famous theorem.

A natural way to develop my present study would be to undertake a careful investigation of the nature of Archimedes’s reasoning to ascertain whether it can be aptly characterized as ‘experimental’, in the Peircean sense, and ‘analytical’, in the Cellucian sense, and to assess its heuristic impact on Bernoulli’s reasoning. I regret I cannot undertake this task in detail in this study. However, I do want to make some pertinent remarks here. Archimedes approximates the value of π in Proposition III of De Circuli Dimensione or Measurement of the Circle.
135 There are at least two 17th-century Latin editions of Archimedes’s works. The first one is Archimedis opera quae extant, David Flurant de Rivualt (Ed.), Paris: Apud Claudium Morellum, 1615; the second is Archimedis opera, Isaac Barrow (Ed.), London: G. Godbid, 1675. I understand that there also is a 17th-century German edition, but I have not been able to find the citation at this time. Any one of these editions may have been used by Bernoulli. It would be interesting to find which one it was, perhaps from Bernoulli’s notebooks and archives at Basel. And the interest would not be only historical; it would be of philosophical relevance because the commentaries of the various editors are illuminating in different ways, and so each one may have influenced Bernoulli’s actual reasoning differently.

Archimedes shows that the ratio of the circumference to the diameter of a circle is less than 3 1/7 but greater than 3 10/71. Historian of mathematics Carl Boyer notes that this approximation is far more precise than those of the Egyptians and Babylonians (Boyer 1989, p. 142-143). Archimedes’s solution consists of two parts, each concerned with finding one of the bounds. It is reasonable to claim, then, that this proposition was originally a problem to be resolved (namely, “approximate the value of π”), that Archimedes ‘reduced’ it to two problems (namely, “find a lower and an upper bound for π”), and that the formal statement of the proposition and its impeccable, elegant proof is only the formal skeleton revealing an involved analytical process. He finds the upper bound by inscribing a circle within a regular hexagon, then doubling recursively the number of sides of the regular polygon circumscribing the circle, up to a 96-side polygon, and showing at the end that the ratio of the perimeter of the 96-side polygon to the diameter of the circle is less than 3 1/7. Invoking relevant theorems from Euclidean geometry, he can deduce that π is also less than 3 1/7.
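The doubling procedure can be sketched numerically. The Python fragment below is a modern reformulation and an assumption of this illustration, not Archimedes’s own computation, which proceeds by careful rational approximations of square roots: it uses the standard harmonic-mean and geometric-mean recurrences for the perimeters of circumscribed and inscribed regular polygons, treats the inscribed case alongside the circumscribed one since the two are analogous, and relies on floating-point arithmetic. The function name `polygon_bounds` is hypothetical.

```python
from math import sqrt


def polygon_bounds(doublings=4):
    """Perimeters of inscribed and circumscribed regular polygons, starting from hexagons.

    The circle has diameter 1, so each perimeter equals the ratio of
    perimeter to diameter and brackets the ratio of circumference to diameter.
    """
    inscribed = 3.0               # inscribed hexagon: 6 sides of length 1/2
    circumscribed = 2 * sqrt(3)   # circumscribed hexagon: 6 sides of length 1/sqrt(3)
    sides = 6
    for _ in range(doublings):
        # Doubling the sides: the new circumscribed perimeter is the harmonic mean
        # of the two current perimeters, and the new inscribed perimeter is the
        # geometric mean of the old inscribed and the new circumscribed perimeter.
        circumscribed = 2 * inscribed * circumscribed / (inscribed + circumscribed)
        inscribed = sqrt(inscribed * circumscribed)
        sides *= 2
    return sides, inscribed, circumscribed


sides, lower, upper = polygon_bounds(4)   # 6 -> 12 -> 24 -> 48 -> 96 sides
print(sides, lower, upper)
print(3 + 10 / 71 < lower < upper < 3 + 1 / 7)
```

With four doublings the two perimeters fall inside the interval from 3 10/71 to 3 1/7, which is the pair of bounds stated in the proposition.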
Archimedes finds the lower bound in a similar fashion, this time starting with a regular hexagon inscribed in a circle, next doubling recursively the number of sides of the inscribed polygon successively up to a 96-side polygon, until he obtains the result that the ratio of the perimeter of the 96-side polygon to the diameter of the circle is greater than 3 10/71. Again invoking Euclidean theorems, he can deduce that π > 3 10/71. Thus, a series of carefully conceived and calculated inequalities leads to the approximation, which Bernoulli considers sufficient for practical use.

136 We must keep in mind, of course, that ancient geometers never used our modern notation π to denote the ratio of the circumference to the diameter of the circle.

This is a clear case in which a heuristic method, a plan of attack on a difficult analytical problem, is suggested to a great mathematician by his knowledge of the history of his subject and what it reveals about methods of mathematical ‘experimentation’. It points to the close link between “deep” mathematical knowledge and effective analytical ‘hypothesis-making’. Cellucci speaks to this link when he writes, “every step of the process of reduction establishes new relations between the problem [under investigation] and existing knowledge, as it is necessary from the moment that resolving a problem generally requires us to transcend the problem’s own limits and to explore the relationship between it and other problems. Existing knowledge plays an essential role in the discovery of hypotheses. Naturally, it does not constitute a sufficient basis for finding new hypotheses because these hypotheses go beyond existing knowledge. But no new hypothesis can be found without starting from data, and the facts of existing knowledge—for example, the problems already resolved—are the data for finding hypotheses” (Cellucci 2002, p. 174). Consequently, Cellucci argues that mathematical inquiry cannot be carried out in closed systems.
Mathematical problems arise in relation to other problems, often from other mathematical systems, and their analysis necessarily calls for knowledge of problems already resolved in other areas. Moreover, the whole of mathematical knowledge is in a continuous process of development and expansion, so that developments in one area may pose problems in another area and may in fact “deepen” the scope of existing problems.

5.2.3 Pragmatic Upshot of the Historical Lessons: Towards a Logic of Mathematical Inquiry

For the best mathematicians, I think, the practice of “deepening” their knowledge in a variety of areas, and especially in the history of mathematics, is simply part of the logica utens that prepares them to make better and more effective ‘analytical hypotheses’ through a variety of heuristic methods. This is their practice, and no prescriptive logic of inquiry is required to guide them. I want to emphasize, however, the upshot of the foregoing discussion for a logica docens. I think it is safe to claim that it is common among mathematicians, or at least the norm in programs of mathematical education, to regard the study of the history of mathematics as entirely inconsequential to learning to conduct actual mathematical research. In my estimation, this attitude impoverishes the research abilities of students of mathematics. Their training ought to expose them to the reasoning of mathematicians throughout the history of mathematics, at least in some areas. This would not only deepen their existing knowledge, but also cultivate their analytical capacity and train them in heuristic methods. As a student of probability, for example, I had a very narrow view of the possible ways to undertake a proof of the law of large numbers.
It has been my study of the work of Bernoulli that has “deepened” my understanding of the theorem; had I undertaken it as a student, such an investigation could also have fostered my analytical ability and strengthened my grasp of a variety of heuristic methods. Likewise, contemporary students of differential and integral calculus would gain tremendous insight into the power of the analytical methods at their disposal by comparing them to those that Archimedes developed in order to study the properties of a wide range of curves, solids, and geometrical figures.

As I have already admitted, my aim here has not been to provide a comprehensive list of heuristic methods for analytical hypothesis-making. I have rather aimed at drawing some of these methods from a historical case study, thereby also illustrating how a careful study of crucial historical discoveries may lead us to a better understanding of the heuristics of mathematical hypothesis-making. Summing up the case of Bernoulli, in his reasoning we again find an example of experimental analysis conducted by way of a variety of heuristic techniques, most notably a ‘reduction’ by way of an isomorphism (a highly formal species of ‘analogy’), the ‘resolution’ or literal breaking apart of an algebraic array into component parts, and the subsequent series of algebraic ‘deductions’ (formal, rule-guided operations that transform one algebraic expression into an equivalent one) involving inequalities between various relevant ratios.

Other examples would reveal more techniques. For instance, De Moivre’s gradual improvement on Bernoulli’s theorem, which eventually led him to the normal approximation of the binomial distribution for the case where p = 1/2, involved a heuristic method that Cellucci catalogues as ‘hybridization’.
‘Hybridization’ is “the inference through which the properties of the object of a certain mathematical domain are transferred to the objects of another domain, giving place to a partial superposition of both domains” (Cellucci 2002, p. 285). A typical example may be Descartes’s development of analytic geometry, in which both equations and curves are hybrids that possess at once algebraic and geometrical properties. In the case of De Moivre, he came to conceive of the binomial expansion not only as an algebraic expression but as a curve. Stigler locates the first indication of this in De Moivre’s 1730 Miscellanea Analytica de Seriebus et Quadraturis (Stigler 1986, pp. 76-77; see also Pearson 1926). De Moivre writes, “If the terms of the binomial are thought of as set upright, equally spaced at right angles to and above a straight line, the extremities of the terms follow a curve. The curve so described has two inflection points, one on each side of the maximal term” (De Moivre 1730, quoted in Stigler 1986, pp. 76-77).

The image that arises is that of a curve that today we would recognize as the curve of a normal distribution. The central or maximal term is that which represents the most probable outcome; in the curve, it would be the peak. The terms that gradually move away from the central, maximal term in either direction represent the gradually less probable outcomes towards the tails of the curve. When De Moivre imaginatively conceives of the binomial expansion as a hybrid, inasmuch as it also represents a curve, he can investigate its properties as a curve, including its points of inflection. He finds that for the binomial distribution the inflection points are located at a distance (1/2)√(n + 2) from the maximal term (Stigler 1986, p. 77; see De Moivre 1730, pp. 109-110).

Footnote 137: The identification of this method is due to Emily Grosholz. See Grosholz 1992 and Grosholz 2000b, p. 88.
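The location of the inflection points can be checked numerically. The sketch below is a modern illustration, not De Moivre’s derivation. It assumes that the distance reported by Stigler is to be read as (1/2)√(n + 2), and it assumes that an ‘inflection point’ of the discrete terms may be read as the place where their second difference changes sign. The function names `second_difference` and `concave_half_width` are hypothetical.

```python
from math import comb, sqrt


def second_difference(n, k):
    """Second difference of the binomial coefficients C(n, k) at position k."""
    return comb(n, k + 1) - 2 * comb(n, k) + comb(n, k - 1)


def concave_half_width(n):
    """For even n, the largest offset d from the central term n/2 at which
    the terms are still concave (negative second difference)."""
    centre = n // 2
    d = 0
    while second_difference(n, centre + d + 1) < 0:
        d += 1
    return d


for n in (10, 100, 1000):
    d = concave_half_width(n)
    # The sign change should fall between offsets d and d + 1.
    print(n, d, 0.5 * sqrt(n + 2), d <= 0.5 * sqrt(n + 2) < d + 1)
```

Working with the coefficients C(n, k) rather than the probabilities C(n, k)/2^n keeps the arithmetic exact, since the common factor 1/2^n does not affect the sign of the second difference.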
Therefore, the ‘hybridization’ of the binomial expansion is the crucial heuristic step that eventually allowed De Moivre to find what we recognize as the first normal approximation to the binomial distribution. The examples could, and ought to, continue to expand our list of heuristic methods.

For now, I want to close our discussion of Bernoulli’s “pure” mathematical reasoning by looking forward to the question of its “applied” scientific upshot. In her assessment of the upshot of Bernoulli’s mathematical theorem, Daston argues that “mathematical conjectures about far more complex and interesting situations like human disease and the weather became possible” because of the theorem (Daston 1988, p. 231). Moreover, she writes that “Bernoulli also drew the attention of mathematicians to the relationship between probabilistic conjecture and inductive reasoning. ... Because Bernoulli and his eighteenth-century successors equated the a priori probabilities with causes and the a posteriori observations with effects, his theorem became a tool for discovering the probability of causes from effects” (Daston 1988, pp. 231-232). However, she charges that even though Bernoulli’s theorem assumes that the true a priori ratio r : s is known, he “did not hesitate to invert his method, without proof or further justification, to find the probability with which [a priori] r and s could be inferred from the observed ratio of fertile to sterile cases” (Daston 1988, p. 236). In other words, she claims that Bernoulli tries to apply the mathematical result to empirical problems without further mathematical or logical warrant. This charge raises precisely the issue for the last part of my study of the logic of mathematical inquiry; namely, how is it logically justifiable to apply ‘ideal’ or ‘hypothetical’ mathematical theories to the scientific study of ‘actual’ events in nature?

Publication date: 2005